Introduction and Summary

For three decades, NASA-sponsored projects have led the astronomical community in a quest to digitally map the celestial sphere at wavelengths other than visible light. Due to these programs and those of our international colleagues, we now have at least crude digital images of the heavens over more than 16 orders of magnitude of frequency in the electromagnetic spectrum. The results of these observations have profoundly changed our understanding of every regime and class of object in astronomy, from the near-Earth environment to the most distant quasi-stellar objects and the primordial fireball.

It is interesting to compare the progress of visible-light surveys during this same interval. The pioneering National Geographic-Palomar Observatory Sky Survey was conducted on photographic emulsions during 1949-1956, just prior to the beginning of the space age. Forty years later, the surveys at wavelengths other than visible light have in some sense far surpassed the more traditional branch of astronomy: we still have no visible-light digital sky survey based on modern detectors. It seems clear that full exploitation of NASA resources requires such a data base, and that its absence will become an ever greater obstruction as future NASA projects continue to return ever more detailed and sophisticated digital views of the Universe.

This document describes a project which will create a comprehensive visible-light digital photometric map of half of the northern sky to very faint magnitudes, together with an associated, voluminous and homogeneous spectroscopic survey. All data obtained by the project will become an open, public resource. Capital construction of the equipment and preparation of the software are largely complete, and data-taking will begin in less than a year. In recognition of substantial funding from the Alfred P. Sloan Foundation, the project is called the Sloan Digital Sky Survey, and is referred to by the acronym SDSS in the following material. This proposal discusses specific parts of the SDSS which will directly impact a variety of NASA-sponsored programs, and requests support to ensure that the relevant results are correctly tailored and easily available to those projects.

The scientific impact of the SDSS data on NASA projects will clearly be very large. However, we also stress the substantial issues of innovative software technology raised by huge data bases. The SDSS data base will total more than 12 Terabytes, fully comparable to the Earth Observing System and the human genome project. By the time the survey is complete, SDSS will have invested more than 100 person-years of effort in development of sophisticated software for acquisition, reduction, manipulation, and archiving of the data base, and much of this software can be used in other contexts. For example, SDSS intelligent software tools for rapid and efficient cross-identification of astronomical objects between disjoint, very large data bases should have broad applicability both inside and outside of astrophysics. We hope and anticipate that software tools, as well as scientific data, enabled by this current proposal will be broadly adopted throughout the community.

The scientific issues addressed by the SDSS span the entire field of astronomy. There are many fundamental, but as yet unanswered, questions about the structure of the Universe and the cosmogony of its constituents, from stars, through galaxies to quasars. Recent advances in mapping the local Universe provide a glimpse of the rich structure in the galaxy distribution. Yet we lack knowledge of even the lowest order statistical properties of the distribution of matter on large scales -- the fluctuation spectrum of galaxies is poorly known on scales beyond 100 Mpc, a full decade below the comoving scale probed by COBE. Such a characterization is a necessary step to understanding the origins of large-scale structure. Our knowledge of the distribution and nature of the highest-redshift objects, quasars, is limited, as is our understanding of the intervening absorbing material. How and where galaxies form is in the realm of speculation, with continuing debate over the origins of the Hubble Sequence. Even the structure of our own Galaxy is controversial. In each of these cases, a major stumbling block is the lack of a uniform sample -- of stars, galaxies or quasars -- that has been reliably selected over a wide area of sky.

To produce the samples necessary to address these problems, we are carrying out a digital photometric and spectroscopic survey over a large fraction of the sky (10,000 deg²) in the north Galactic cap, complete within precisely defined selection criteria, and a much deeper imaging survey in the southern Galactic hemisphere covering 225 deg² and a comoving volume roughly a quarter that of the northern imaging survey. The photometric map of the sky will measure accurate flux densities of objects almost simultaneously in five bands (u', g', r', i', and z') with effective wavelengths of 3540 Å, 4760 Å, 6280 Å, 7690 Å, and 9250 Å, complete to limiting (5:1 signal-to-noise) point source magnitudes of 22.3, 23.3, 23.1, 22.3, and 20.8 (ABnu), respectively, in the north Galactic cap. The survey sky coverage of about pi steradians will result in photometric measurements to the above detection limits for about 5×10⁷ galaxies and a somewhat larger number of stars. The morphological and color information from the images will allow robust star-galaxy-quasar separation, yielding a photometric sample of about 10⁶ quasar candidates. Astrometric positions will be produced which we believe will be accurate to roughly 50 milliarcseconds for sources brighter than about 20.5. Medium resolution spectra will be obtained for the 10⁶ galaxies brighter than about r' ~ 18.1, approximately 10⁵ quasars brighter than g' ~ 19.7, and carefully selected samples of stars. The deeper southern survey will go about 2.0 magnitudes fainter in all bands, and, in addition to being a bridge between the main northern survey and much deeper pencil-beam surveys possible with very large telescopes, will contain a wealth of information about faint variable sources, supernovae, and proper motions. The SDSS will make a substantial indirect contribution to all such very deep surveys by characterizing the nearby Universe in a detailed and quantitative manner. Without this information, which does not exist in any satisfactorily accurate and complete form at present, one cannot compare the Universe observed at great distances and ancient times to the Universe today.
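
For readers who prefer flux densities to magnitudes, the limiting magnitudes quoted above can be converted with a few lines of Python. This is only a sketch: it assumes that "ABnu" denotes the standard AB system, in which a zero-magnitude source has a flux density of 3631 Jy, and the function name is ours, not part of any SDSS software.

```python
# Sketch: convert the quoted AB limiting magnitudes to flux densities.
# Assumption: the standard AB zero point of 3631 Jy applies.
AB_ZERO_POINT_JY = 3631.0


def ab_mag_to_microjansky(mag):
    """Return the flux density in microjanskys for an AB magnitude."""
    return AB_ZERO_POINT_JY * 10.0 ** (-0.4 * mag) * 1.0e6


# Limiting (5:1 signal-to-noise) point source magnitudes from the text.
limits = {"u'": 22.3, "g'": 23.3, "r'": 23.1, "i'": 22.3, "z'": 20.8}

for band, mag in limits.items():
    flux = ab_mag_to_microjansky(mag)
    print(f"{band}: {mag:.1f} mag -> {flux:.2f} microJy")
```

Under this assumption the u' and i' limits of 22.3 correspond to a few microjanskys, and the deeper southern survey (about 2.0 magnitudes fainter) would reach flux densities roughly six times lower.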

These data will allow investigation of specific questions such as those discussed above. The galaxy spectroscopic survey will yield a three-dimensional map of the Universe to a redshift of about z = 0.2, many times the volume of the largest structures predicted by current theories of structure formation or observed in existing redshift surveys. Thus we will be able to measure the power spectrum of galaxy density fluctuations on scales up to 10³ Mpc and with far greater precision than possible before. On much larger scales, the 10⁵ quasars will allow the investigation of angular structure up to the present horizon scale and the evolution of that structure over most of the age of the Universe. Using the stellar data, it will be possible to determine the distribution, kinematics and nature of blue stars in the Galactic halo, resulting in constraints on the star formation history of this component of the Galaxy and on the Galactic potential well. These are but examples; Chapter 3 of this proposal describes the scientific rationale for this survey in detail. We emphasize, however, that the funding we request from NASA is not to support our conduct of this scientific program, or even the capital construction and software preparation, but rather only to optimize the software and data base for use in NASA-related programs.

This project will be the first large-area photometric survey to use CCD detectors. Why now? The venerable Palomar Observatory Sky Survey has been the backbone of many astronomical investigations for the past forty years. Photographic plates lack the sensitivity, linearity, and dynamic range of CCD detectors, resulting in biases and incompletenesses in photographic surveys. However, due to their large area, photographic plates have retained their competitive edge over CCDs until very recently. We now have the technology to produce working arrays of very large-area CCDs, resulting in a detector system which, as we describe below, is far superior to large photographic plates for gathering optical survey data.

A successful survey requires that we build a dedicated, special-purpose 2.5 meter telescope. Our design has a wide, well-corrected field (3°), equipped with a large focal plane CCD array for photometry and with a pair of double fiber-fed spectrographs which allow the measurement of 640 spectra per field covering 3900 Å -- 9200 Å at a resolving power lambda / Delta lambda of about 2000. The detectors for the photometric survey are an array of thirty large-area (2048 x 2048) CCDs. The data will be taken in scanning (time-delay-and-integrate, or TDI) mode, which will effectively use all of the observing time to gather data. The pixel size (0.4") and the optical quality of the telescope are such that the resolution will be seeing-limited. There will in addition be an array of astrometric CCDs which will eventually allow tying the photometric survey to the Hipparcos reference frame to an accuracy of the order of 50 milliarcseconds.
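
To give a feel for what scanning mode implies, the following sketch estimates how long a source stays on one photometric CCD during a TDI scan. The chip size and pixel scale come from the text; the scan rate is an assumption (the sidereal rate of about 15.04 arcseconds per second, which the text does not state), and the function name is purely illustrative.

```python
# Sketch: effective integration time per CCD in a TDI (drift) scan.
# Assumption: the sky moves across the chip at the sidereal rate.
SIDEREAL_RATE_ARCSEC_PER_S = 15.041


def tdi_exposure_seconds(n_rows, pixel_scale_arcsec, scan_rate_arcsec_per_s):
    """Time for a source to cross n_rows pixels at the given scan rate."""
    chip_height_arcsec = n_rows * pixel_scale_arcsec
    return chip_height_arcsec / scan_rate_arcsec_per_s


# 2048 rows at 0.4 arcsec per pixel, as quoted in the text.
exposure = tdi_exposure_seconds(2048, 0.4, SIDEREAL_RATE_ARCSEC_PER_S)
print(f"Effective exposure per CCD: {exposure:.1f} s")
```

Under this assumption each source is integrated for a little under a minute per CCD, with no time lost to shutter operation or readout between exposures.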

The gain we achieve with our survey design may be quantified as follows: a measure of the information rate for surveys is the survey efficiency epsilon:

    epsilon = Omega D² q

where Omega is the solid angle of sky on the detectors, D the diameter of the telescope, and q the detector quantum efficiency. The time to complete a survey to a given limit and signal-to-noise ratio is clearly inversely proportional to epsilon. The 48-inch Schmidt has an epsilon of about 0.30 m² deg², allowing an (optimistic) photographic quantum efficiency of half a percent. The proposed SDSS telescope with its CCD array has an efficiency more than an order of magnitude larger, about 4.8 in these units, and furthermore allows mapping the sky almost simultaneously in five bands.
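
The two efficiencies quoted above can be roughly reproduced with a short calculation. This sketch assumes the efficiency takes the form epsilon = Omega D² q, which is consistent with the definitions and units given in the text. It further assumes a 6.5 x 6.5 degree Schmidt plate and an average CCD quantum efficiency of 0.5; neither number appears in the text, so the results are indicative only.

```python
# Sketch: survey efficiency, assuming epsilon = Omega * D^2 * q.
def survey_efficiency(solid_angle_deg2, diameter_m, quantum_efficiency):
    """Return the survey efficiency in m^2 deg^2."""
    return solid_angle_deg2 * diameter_m ** 2 * quantum_efficiency


# 48-inch Schmidt with photographic plates.
# Assumed: 6.5 x 6.5 degree plate. q = 0.5 percent is quoted in the text.
schmidt = survey_efficiency(6.5 * 6.5, 48 * 0.0254, 0.005)

# SDSS photometric array: thirty 2048 x 2048 CCDs at 0.4 arcsec per pixel.
ccd_side_deg = 2048 * 0.4 / 3600.0
sdss_solid_angle = 30 * ccd_side_deg ** 2
# Assumed: average CCD quantum efficiency of 0.5 (not stated in the text).
sdss = survey_efficiency(sdss_solid_angle, 2.5, 0.5)

print(f"Schmidt: {schmidt:.2f} m^2 deg^2")
print(f"SDSS:    {sdss:.2f} m^2 deg^2")
print(f"Ratio:   {sdss / schmidt:.1f}")
```

With these assumed inputs the results land close to the values of about 0.30 and 4.8 given above, and the ratio exceeds an order of magnitude.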

This project represents a new departure in that the photometric catalog used to select the objects whose spectra will be measured will be produced concurrently with the much more time-consuming spectroscopic survey. This strategy is designed to make optimum use of the observing time, as excellent photometric conditions (in terms of seeing, sky brightness and sky transparency) are present for only a minority of the time even at the best sites. The photometry will be done only in the best seeing conditions; spectroscopy will be carried out on less pristine nights. The hardware design allows for rapid changeover between photometric and spectroscopic modes, so that both can be done on the same night if the weather changes. This strategy allows the efficient and appropriate use of partial nights, including nights when the Moon is up for part of the night. Execution of this strategy requires almost real-time production of a well-defined catalog of galaxies, quasars and stars, to be achieved by reliable, fast software. A full year to test hardware and software systems is incorporated into the observing plan; we expect the survey itself to take five years to complete.

The database that will result from the SDSS will be enormous; a processed pixel map of the whole region at 0.4 arcsecond resolution is about 8.2 Terabytes, and the extracted spectra alone about 50 Gigabytes. We plan to publish an atlas of images of detected objects from the imaging survey and images of sources from all available catalogs from other wavelength regions which have sufficiently accurate positions. We will publish a low-resolution version of the whole SDSS, the spectra, and a variety of catalogs constructed from the data on CD-ROMs (or some equivalent and, hopefully, more capacious medium available at the time). We will make this database available at cost to the community, and will publish the data in two stages. The first stage will include the results of the first two years of the SDSS; the second release will include the entire survey. Comparison of the SDSS with results of large-area surveys in other wavelength ranges will allow construction of a unified astronomical database. Much as the Palomar Observatory Sky Survey has served the astronomical community for the past several decades, and has been so valuable that the plates from both the first and second-generation surveys have been digitized, we expect that the SDSS will provide imaging material (in well calibrated digital form) for the new generation of large telescopes; in particular, the very deep southern survey will contain objects fainter than even the largest planned ground- or space-based telescopes are likely to be able to explore spectroscopically. Dissemination of these data to the community will no doubt result in discoveries which we have not anticipated in the extensive discussion of survey science (Chapter 3) in the present proposal.
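
The size of the pixel map follows from simple arithmetic, sketched below. The pixel scale, the five bands, and the survey area come from the text; the storage of 2 bytes (16 bits) per pixel and the convention 1 Terabyte = 10¹² bytes are assumptions made for illustration.

```python
import math


def pixel_map_terabytes(area_deg2, pixel_scale_arcsec=0.4,
                        n_bands=5, bytes_per_pixel=2):
    """Estimate the size of a multi-band pixel map in units of 10^12 bytes.

    bytes_per_pixel=2 is an assumption (16-bit pixels), not a quoted value.
    """
    pixels_per_deg2 = (3600.0 / pixel_scale_arcsec) ** 2
    total_bytes = area_deg2 * pixels_per_deg2 * n_bands * bytes_per_pixel
    return total_bytes / 1.0e12


# Survey area quoted as 10,000 deg^2, or equivalently about pi steradians.
area_pi_steradians = math.pi * (180.0 / math.pi) ** 2

print(f"10,000 deg^2:  {pixel_map_terabytes(10000.0):.1f} TB")
print(f"pi steradians: {pixel_map_terabytes(area_pi_steradians):.1f} TB")
```

Both estimates fall within a few percent of the figure of about 8.2 Terabytes quoted above, which suggests that the quoted size is dominated by the raw pixel count rather than by catalog overhead.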

In addition, we anticipate that, as the survey progresses, there will be a substantial amount of time available for other projects on a proposal basis using the SDSS hardware and software. We anticipate (and hope) that the technology developed as part of this project, particularly the optics and detectors and associated data acquisition electronics but also the quite considerable software development, will be of use to the community; we will make available all the software used to generate the data base.

We turn now to a guide through the voluminous material in this proposal, which is divided into three parts. Volume I develops the scientific motivation and rationale for the SDSS in some detail. The SDSS covers a truly vast range of science, and we discuss this from two points of view. Chapter 2 describes the synergy between SDSS and future large NASA projects, with emphasis on surveys. In particular, we describe some of the exciting science we might expect from an intercomparison of the SDSS data base with the X-ray survey produced by ROSAT, the 1.3 - 2.2 µm data produced by 2MASS and the 12/25 µm data from WIRE. We also briefly discuss the software and education fallout from the SDSS and its comparison with large NASA surveys. The data bases are simply too vast to be examined or analyzed by conventional techniques, and a major effort is underway to build database access tools to make analysis of the SDSS data possible. Such tools can be used by a much larger number of individuals than the scientists at the SDSS institutions or even the worldwide community of astronomers; with appropriate modifications and adaptations, they will allow us all to become armchair, or, better, desktop, astronomers and to make real discoveries in the SDSS and other databases. A most interesting aspect of this is the adaptation of the software and data bases to be a teaching tool in elementary school; this is described in Section 2.4.

Chapter 3 takes another point of view, and describes the astronomical impact which we expect SDSS to have. This chapter discusses several large astronomical areas, and places some emphasis on the gains to be made from intercomparison of the SDSS and large NASA data bases. First, we describe the fundamental science for which the SDSS is designed: the measurement of the large-scale structure of the Universe. As we will see, the combination of SDSS and current and planned microwave background measurements will be a very powerful, and probably definitive, probe of the evolution of structure in the Universe. Section 3.2 discusses what we hope to learn from SDSS observations of clusters of galaxies; there are wonderful opportunities to study the evolution of the clusters from a comparison of SDSS photometric and spectroscopic data with the results of the ROSAT survey. Section 3.3 describes the second main emphasis of the SDSS, a vast study of quasars. This section also contains a discussion of X-ray and infrared emission from quasars. The SDSS will enable very large studies of galaxies, of Galactic structure and of stars (Sections 3.4, 3.5 and 3.6). The possibility for serendipitous discoveries and the impact of the SDSS on studies of the Solar System (in particular of small bodies such as those in the Kuiper Belt) and of the interstellar extinction are described in the last three sections of this chapter. In Appendix A we compare the SDSS to existing and planned surveys in an attempt to place our proposed work in the context of current investigations of large scale structure and galaxy evolution. While the SDSS covers only a quarter of the sky, does not image the Galactic plane, and saturates at 14th magnitude, it is nevertheless possible to design a strategy using the SDSS telescope and camera which will produce a complete stellar photometric survey. Possible plans for doing this are described in Appendix B.

Volume II describes the design and construction of the SDSS in detail. Chapter 4 acts as an introduction by describing the present (December 1, 1996) status of the project construction. Chapter 5 contains a detailed discussion of the survey strategy. Here, we consider such issues as how we can accomplish the scientific goals of the Survey outlined in Chapter 3; the required photometric and spectroscopic sensitivity, resolution and accuracy; the choice of the photometric system; the choice of the area of the sky to be surveyed; and the required depth and integration times. We consider how we can most efficiently carry out the survey, and also several aspects of the strategy as it relates to the survey of the southern Galactic cap. The dedicated 2.5 meter telescope is described in Chapter 6. The mechanical design and expected performance, and the (somewhat novel) enclosure are discussed as well as the optical design, which allows excellent imaging over a wide field with negligible distortion and chromatic aberration. The observatory site, Apache Point, New Mexico (where the ARC consortium is currently operating a 3.5 meter telescope), is described next in Chapter 7; we discuss the construction and facilities for the SDSS telescope, as well as the expected astronomical properties of the site (the seeing, darkness and frequency of photometric weather). The photometric camera, with its array of 54 CCD detectors on the focal plane, is described in Chapter 8. Details of the five-color filter system are also presented here. Chapter 9 contains a discussion of the photometric calibration of the SDSS. Since the photometric survey is designed to find objects whose spectra will then be observed with fiber spectrographs, we need to produce positions accurate to better than 0.2 arcseconds in almost real time. As it turns out, we may be able to do much better than this, so that the SDSS will have considerable astrometric value as well. The proposed astrometry is described in Chapter 10. We then turn to a description of the CCD fiber spectrographs in Chapter 11. Galaxies are clustered, i.e., they are not uniformly distributed on the sky. In order to measure spectra of galaxies to a uniform magnitude limit, we have conceived a system of adaptive tiling, whereby the spectroscopic fields to be observed overlap to a greater or lesser extent depending on the local surface density of galaxies. This scheme is described in Chapter 12. Chapter 13 describes the extensive simulations of the SDSS data which have been made to aid the planning and software development. Chapter 14 discusses our data-taking system and the series of software pipelines that will automatically reduce the data. Chapter 15 describes how the survey will be operated. Appendices C and D contain fairly detailed discussions of the architecture of the data archive and of the database software.
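
The motivation for adaptive tiling can be seen from a back-of-the-envelope estimate, sketched below. It is not the tiling scheme of Chapter 12; it only compares the number of galaxy targets falling in one spectroscopic field with the 640 spectra available per field. The field diameter, fiber count, and galaxy totals come from this introduction. Treating the field as a full 3° circle and giving every fiber to a galaxy (ignoring quasars, stars, and sky fibers) are simplifying assumptions, and the function names are hypothetical.

```python
import math

FIBERS_PER_FIELD = 640      # spectra per field, from the text
FIELD_DIAMETER_DEG = 3.0    # telescope field of view, from the text


def field_area_deg2(diameter_deg=FIELD_DIAMETER_DEG):
    """Area of a circular field in square degrees."""
    return math.pi * (diameter_deg / 2.0) ** 2


def coverage_factor(targets_per_deg2, fibers=FIBERS_PER_FIELD):
    """Average number of overlapping fields needed to observe every target.

    Simplification: all fibers are assigned to galaxies.
    """
    return targets_per_deg2 * field_area_deg2() / fibers


# Mean density: about 10^6 galaxies spread over about 10,000 deg^2.
mean_density = 1.0e6 / 1.0e4

for label, density in [("half the mean", 0.5 * mean_density),
                       ("the mean", mean_density),
                       ("three times the mean", 3.0 * mean_density)]:
    factor = coverage_factor(density)
    print(f"At {label} density: coverage factor {factor:.2f}")
```

At the mean density the target count per field is already comparable to the number of fibers, so regions of enhanced density need overlapping fields while sparse regions do not, which is the behavior the adaptive scheme is meant to provide.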

The third volume of the proposal deals with administrative matters. The management is described in Chapter 16. This document is a revision, update, and extension of a successful proposal submitted in 1993 to the National Science Foundation, which resulted in a grant for ~10% of the total project cost. As noted above, this current proposal is by contrast for the operations phase of the project only, and will enable those functions of the project that are crucial to support NASA-related activities.